Evaluating Answer Extraction for Why-QA using RST-annotated Wikipedia texts
Abstract
This paper focuses on the task of answer extraction for why-questions. Unlike factoid QA, finding answers to why-questions involves exploiting text structure. We therefore approach answer extraction as a discourse analysis task, using Rhetorical Structure Theory (RST) as a framework. We evaluated this method on a set of why-questions that were submitted to the online question answering system answers.com, together with a corpus of answer fragments from Wikipedia that were manually annotated with RST structures. The maximum recall that our answer extraction procedure can obtain is about 60%. We suggest paragraph retrieval as a supplementary and alternative approach to RST-based answer extraction.
Similar resources
Exploring the use of linguistic analysis for answering why-questions
In the current project, we aim at developing an approach for automatically answering why-questions (why-QA). In the present paper, we investigate the relevance of linguistic analysis for why-QA. We focus on two tasks: the use of syntactic information for answer type determination and the use of discourse structure for the extraction of possible answers from retrieved documents. For answer type d...
Use of Linguistic Analysis for Answering Why-Questions
In the current project, we aim at developing an approach for automatically answering why-questions (why-QA). In the present paper, we investigate the relevance of linguistic analysis for why-QA. We focus on two tasks: the use of syntactic information for answer type determination and the use of discourse structure for the extraction of possible answers from retrieved documents. For answer type d...
Leveraging Wikipedia Characteristics for Search and Candidate Generation in Question Answering
Most existing Question Answering (QA) systems adopt a type-and-generate approach to candidate generation that relies on a pre-defined domain ontology. This paper describes a type independent search and candidate generation paradigm for QA that leverages Wikipedia characteristics. This approach is particularly useful for adapting QA systems to domains where reliable answer type identification an...
Understanding questions and finding answers: semantic relation annotation to compute the Expected Answer Type
The paper presents an annotation scheme for semantic relations developed and used for question classification and answer extraction in an interactive dialogue based quiz game. The information that forms the content of this game is concerned with biographical facts of famous people’s lives and is often available as unstructured texts on internet, e.g. Wikipedia collection. Questions asked as wel...
GIRSA-WP at GikiCLEF: Integration of Structured Information and Decomposition of Questions
This paper describes the current GIRSA-WP system and the experiments performed for GikiCLEF 2009. GIRSA-WP (GIRSA for Wikipedia) is a fully-automatic, hybrid system combining methods from question answering (QA) and geographic information retrieval (GIR). It merges results from InSicht, a deep (text-semantic) open-domain QA system, and GIRSA, a system for textual GIR. For the second participati...
Publication date: 2007